Structured Variable Selection with Sparsity-Inducing Norms
نویسندگان
چکیده
We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual l1-norm and the group l1-norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problems. We first explore the relationship between the groups defining the norm and the resulting nonzero patterns, providing both forward and backward algorithms to go back and forth from groups to patterns. This allows the design of norms adapted to specific prior knowledge expressed in terms of nonzero patterns. We also present an efficient active set algorithm, and analyze the consistency of variable selection for least-squares linear regression in low and high-dimensional settings.
منابع مشابه
STRUCTURED VARIABLE SELECTION WITH SPARSITY-INDUCING NORMS Structured Variable Selection with Sparsity-Inducing Norms
We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual l1-norm and the group l1-norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problem...
متن کاملActive Set Algorithm for Structured Sparsity-Inducing Norms
We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual l1-norm and the group l1-norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problem...
متن کاملExploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
For supervised and unsupervised learning, positive definite kernels allow to use large and potentially infinite dimensional feature spaces with a computational cost that only depends on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing norms such as the l-no...
متن کاملOptimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appr...
متن کاملConvex Optimization with Sparsity-Inducing Norms
The principle of parsimony is central to many areas of science: the simplest explanation to a given phenomenon should be preferred over more complicated ones. In the context of machine learning, it takes the form of variable or feature selection, and it is commonly used in two situations. First, to make the model or the prediction more interpretable or computationally cheaper to use, i.e., even...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Journal of Machine Learning Research
دوره 12 شماره
صفحات -
تاریخ انتشار 2011